i.e. the removal of metals from contaminated soil or
aqueous media (Salt et al., 1998).

[0013] However, practical difficulties still have to be solved
before said hyperaccumulators can be used efficiently, among which
are the slow growth rate and low growth habit (rosette) of many
hyperaccumulators, and the specific nature of their metal tolerance
(Ernst, 1995).

Aims of the invention 
[0014] The present invention aims to provide polynucleotide and
polypeptide sequences associated with cadmium tolerance and
accumulation in plant cells.
[0015] The present invention also aims to provide polynucleotide
sequences, and regulatory sequences containing said polynucleotide
sequences, able to improve the cadmium tolerance of plant cells when
expressed in foreign organisms.

[0016] The present invention aims to provide a recombinant plant
expressing said polynucleotide sequences, which could be used for
phytoremediation applications and/or for phytoextraction applications.

[0017] A last object of the present invention is to provide such a
plant or plant cell or tissue expressing said polynucleotide sequence
which presents a sufficient growth rate for phytoremediation
applications and from which cadmium can be easily extracted for
recycling purposes.

Definitions 

[0018] By "phytoremediation" is meant the use of green plants to
remove pollutants from the environment or to render them harmless.
It encompasses phytoextraction, phytodegradation, rhizofiltration,
phytostabilisation, phytovolatilisation and the use of plants to
remove pollutants from air (Salt et al., 1998).

[0019] Phytoextraction is the use of pollutant-accumulating plants
to remove metals or organics from soil by concentrating them in the
harvestable parts.

[0020] Preferably, said phytoremediation is a hyperaccumulation,
which means the capacity of said plants to accumulate heavy metals in
greater quantities than a plant normally does. By "hyperaccumulator"
is meant a plant containing in its aerial parts at least 10 times,
preferably at least 100 times, more metals than other plants grown on
contaminated soil (for cadmium the threshold is 100 µg/g dry weight
(0.01%) (Ref. Brooks et al., Trends in Plant Science, vol. 3, no. 9,
p. 359-362)).
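
As a worked illustration of the cadmium threshold quoted in paragraph [0020], the following minimal sketch converts a concentration in µg/g dry weight into a weight percentage and compares a measured value with the 100 µg/g figure. The function and constant names are hypothetical and serve only this example; they are not part of the invention.

```python
# Cadmium threshold quoted in the text, in µg Cd per g dry weight.
CD_THRESHOLD_UG_PER_G = 100.0


def ug_per_g_to_percent(conc_ug_per_g: float) -> float:
    """Convert µg/g dry weight to weight percent (1 g = 1e6 µg)."""
    return conc_ug_per_g / 1e6 * 100.0


def is_cd_hyperaccumulator(shoot_cd_ug_per_g: float) -> bool:
    """True when the aerial-part Cd concentration reaches the threshold."""
    return shoot_cd_ug_per_g >= CD_THRESHOLD_UG_PER_G
```

With these definitions, 100 µg/g corresponds to 0.01% by weight, which matches the percentage given in the paragraph.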

[0021] The term "polypeptide" refers to any peptide or protein
comprising two or more amino acids joined to each other by peptide
bonds or modified peptide bonds, i.e., peptide isosteres. The term
"polypeptide" refers both to short chains, commonly referred to as
peptides, oligopeptides or oligomers, and to longer chains, generally
referred to as proteins. Polypeptides may contain amino acids other
than the 20 gene-encoded amino acids. "Polypeptides" include amino
acid sequences modified either by natural processes, such as
posttranslational processing, or by chemical modification techniques
which are well known in the art. Such modifications are well described
in basic texts and in more detailed monographs, as well as in a
voluminous research literature. Modifications can occur anywhere in a
polypeptide, including the peptide backbone, the amino acid
side-chains and the amino or carboxyl termini. It will be appreciated
that the same type of modification may be present in the same or
varying degrees at several sites in a given polypeptide. Also, a given
polypeptide may contain many types of modifications. Polypeptides may
be branched as a result of ubiquitination, and they may be cyclic,
with or without branching. Cyclic, branched and branched cyclic
polypeptides may result from posttranslational natural processes or
may be made by synthetic methods. Modifications include acetylation,
acylation, ADP-ribosylation, amidation, covalent attachment of flavin,
covalent attachment of a heme moiety, covalent attachment of a
nucleotide or nucleotide derivative, covalent attachment of a lipid or
lipid derivative, covalent attachment of phosphatidylinositol,
cross-linking, cyclization, disulfide bond formation, demethylation,
formation of covalent cross-links, formation of cystine, formation of
pyroglutamate, formylation, gamma-carboxylation, glycosylation, GPI
anchor formation, hydroxylation, iodination, methylation,
myristoylation, oxidation, proteolytic processing, phosphorylation,
prenylation, racemization, selenoylation, sulfation, transfer-RNA
mediated addition of amino acids to proteins such as arginylation, and
ubiquitination. See, for instance, PROTEINS - STRUCTURE AND MOLECULAR
PROPERTIES, 2nd Ed., T. E. Creighton, W. H. Freeman and Company, New
York, 1993 and Wold, F., Posttranslational Protein Modifications:
Perspectives and Prospects, pp. 1-12 in POSTTRANSLATIONAL COVALENT
MODIFICATION OF PROTEINS, B. C. Johnson, Ed., Academic Press, New
York, 1983; Seifter et al., "Analysis for protein modifications and
non-protein cofactors", Meth. Enzymol. (1990) 182: 626-646 and Rattan
et al., "Protein Synthesis: Posttranslational Modifications and
Aging", Ann NY Acad Sci (1992) 663: 48-62.

[0022] The term "polynucleotide" generally refers to any
polyribonucleotide or polydeoxyribonucleotide, which may be unmodified
RNA or DNA or modified RNA or DNA. "Polynucleotides" include, without
limitation, single- and double-stranded DNA, DNA that is a mixture of
single- and double-stranded regions, single- and double-stranded RNA,
and RNA that is a mixture of single- and double-stranded regions,
hybrid molecules comprising DNA and RNA that may be single-stranded
or, more typically, double-stranded or a mixture of single- and
double-stranded regions. The term "polynucleotide" also includes DNAs
or RNAs containing one or more modified bases and DNAs or RNAs with
backbones modified for stability or for other reasons. "Modified"
bases include, for example, tritylated bases and unusual bases such as
inosine. A variety of modifications have been made to DNA and RNA;
thus, "polynucleotide" embraces chemically, enzymatically or
metabolically modified forms of polynucleotides as typically found in
nature, as well as the chemical forms of DNA and RNA characteristic of
viruses and cells. "Polynucleotide" also embraces relatively short
polynucleotides, often referred to as oligonucleotides.
[0023] The term "variant", as used herein, refers to a polynucleotide
or polypeptide that differs from a reference polynucleotide or
polypeptide respectively, but retains essential properties. A typical
variant of a polynucleotide differs in nucleotide sequence from
another, reference polynucleotide. Changes in the nucleotide sequence
of the variant may or may not alter the amino acid sequence of a
polypeptide encoded by the reference polynucleotide. Nucleotide
changes may result in amino acid substitutions, additions, deletions,
fusions and truncations in the polypeptide encoded by the reference
sequence, as discussed below. A typical variant of a polypeptide
differs in amino acid sequence from another, reference polypeptide.
Generally, differences are limited so that the sequences of the
reference polypeptide and the variant are closely similar overall and,
in many regions, identical. A variant and reference polypeptide may
differ in amino acid sequence by one or more substitutions (preferably
conservative), additions and deletions in any combination. A
substituted or inserted amino acid residue may or may not be one
encoded by the genetic code. A variant of a polynucleotide or
polypeptide may be naturally occurring, such as an allelic variant, or
it may be a variant that is not known to occur naturally.
Non-naturally occurring variants of polynucleotides and polypeptides
may be made by mutagenesis techniques or by direct synthesis. Variants
should retain one or more of the biological activities of the
reference polypeptide. For instance, they should have antigenic or
immunogenic activities similar to those of the reference polypeptide.
Antigenicity can be tested using standard immunoblot experiments,
preferably using polyclonal sera against the reference polypeptide.
Immunogenicity can be tested by measuring antibody responses (using
polyclonal sera generated against the variant polypeptide) against
purified reference polypeptide in a standard ELISA test. Preferably, a
variant would retain all of the above biological activities.

[0024] The term "identity" is a measure of the identity of nucleotide
sequences or amino acid sequences. In general, the sequences are
aligned so that the highest order match is obtained. "Identity" per se
has an art-recognised meaning and can be calculated using published
techniques. See, e.g.: (COMPUTATIONAL MOLECULAR BIOLOGY, Lesk, A.M.,
ed., Oxford University Press, New York, 1988; BIOCOMPUTING:
INFORMATICS AND GENOME PROJECTS, Smith, D.W., ed., Academic Press, New
York, 1993; COMPUTER ANALYSIS OF SEQUENCE DATA, PART I, Griffin, A.M.,
and Griffin, H.G., eds, Humana Press, New Jersey, 1994; SEQUENCE
ANALYSIS IN MOLECULAR BIOLOGY, von Heijne, G., Academic Press, 1987;
and SEQUENCE ANALYSIS PRIMER, Gribskov, M. and Devereux, J., eds, M
Stockton Press, New York, 1991). While there exist a number of methods
to measure identity between two polynucleotide or polypeptide
sequences, the term "identity" is well known to skilled artisans
(Carillo, H., and Lipman, D., SIAM J Applied Math (1988) 48: 1073).
Methods commonly employed to determine identity or similarity between
two sequences include, but are not limited to, those disclosed in
Guide to Huge Computers, Martin J. Bishop, ed., Academic Press, San
Diego, 1994, and Carillo, H., and Lipman, D., SIAM J Applied Math
(1988) 48: 1073. Methods to determine identity and similarity are
codified in computer programs. Preferred computer program methods to
determine identity and similarity between two sequences include, but
are not limited to, the GCG program package (Devereux, J., et al., J
Molec Biol (1990) 215: 403). Most preferably, the program used to
determine identity levels is the GAP program, as used in the Examples
hereafter.

[0025] As an illustration, by a polynucleotide having a nucleotide
sequence having at least, for example, 95% "identity" to a reference
nucleotide sequence is intended that the nucleotide sequence of the
polynucleotide is identical to the reference sequence except that the
polynucleotide sequence may include, on average, up to five point
mutations per 100 nucleotides of the reference nucleotide sequence. In
other words, to obtain a polynucleotide having a nucleotide sequence
at least 95% identical to a reference nucleotide sequence, up to 5% of
the nucleotides in the reference sequence may be deleted or
substituted with another nucleotide, or a number of nucleotides up to
5% of the total nucleotides in the reference sequence may be inserted
into the reference sequence. These mutations of the reference sequence
may occur at the 5' or 3' terminal positions of the reference
nucleotide sequence or anywhere between those terminal positions,
interspersed either individually among nucleotides in the reference
sequence or in one or more contiguous groups within the reference
sequence.
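
The percentage described in paragraph [0025] can be illustrated with a short calculation on two sequences that have already been aligned. This is only a sketch under stated assumptions: the function name `percent_identity` is hypothetical, the alignment itself is taken as given, and the way gaps are counted here (every column with a residue in at least one sequence counts towards the total) is an assumption that may differ from the scoring used by the GAP program.

```python
def percent_identity(aligned_ref: str, aligned_query: str, gap: str = "-") -> float:
    """Percent of alignment columns in which both sequences carry the same residue.

    Both inputs must already be aligned and therefore of equal length.
    Columns that are gaps in both sequences are ignored (assumption).
    """
    if len(aligned_ref) != len(aligned_query):
        raise ValueError("aligned sequences must have the same length")
    matches = 0
    columns = 0
    for r, q in zip(aligned_ref.upper(), aligned_query.upper()):
        if r == gap and q == gap:
            continue  # empty column, carries no information
        columns += 1
        if r == q:
            matches += 1
    if columns == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / columns


# Hypothetical 100-nucleotide reference with five point substitutions.
reference = "ACGT" * 25
mutated = list(reference)
for position in (0, 20, 40, 60, 80):
    mutated[position] = "C"  # each of these positions holds an "A" in the reference
query = "".join(mutated)
identity = percent_identity(reference, query)  # five mismatches in 100 columns
```

Five substitutions in a 100-nucleotide reference leave 95 matching columns, which is the boundary case of the "at least 95%" wording.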
[0026] Fragments of polypeptides are also included in the present
invention. A fragment is a polypeptide having an amino acid sequence
that is the same as a part, but not all, of the amino acid sequence of
the aforementioned polypeptides. As with polypeptides, fragments may
be "free-standing" or comprised within a larger polypeptide of which
they form a part or region, most preferably as a single continuous
region. Representative examples of polypeptide fragments of the
invention include, for example, fragments from about amino acid number
752 to about amino acid number 1030 of the polypeptide. In this
context "about" includes the particularly recited ranges larger or
smaller by several, 5, 4, 3, 2 or 1 amino acid at either extreme or at
both extremes.

[0027] Preferred fragments include, for example, truncated
polypeptides having the amino acid sequence of polypeptides, except
for deletion of a continuous series of residues that includes the
amino terminus, or a continuous series of residues that includes the
carboxyl terminus and/or transmembrane region, or deletion of two
continuous series of residues, one including the amino terminus and
one including the carboxyl terminus. Also preferred are fragments
characterised by structural or functional attributes such as fragments
that comprise alpha-helix and alpha-helix forming regions, beta-sheet
and beta-sheet forming regions, turn and turn-forming regions, coil
and coil-forming regions, hydrophilic regions, hydrophobic regions,
alpha amphipathic regions, beta amphipathic regions, flexible regions,
surface-forming regions, substrate binding region, and high antigenic
index regions. Other preferred fragments are biologically active
fragments. Biologically active fragments are those that mediate the
protein activity, including those with a similar activity or an
improved activity, or with a decreased undesirable activity. Also
included are those that are antigenic or immunogenic in an animal or
in a human.

Summary of the invention 

[0028] The present invention is related to isolated and purified
genetic sequences from Thlaspi caerulescens, said sequence being
selected from the group consisting of SEQ ID NO. 1 to SEQ ID NO. 32,
as well as other sequences isolated from unknown (micro)-organisms,
SEQ ID NO. 33 and SEQ ID NO. 34.

[0029] The present invention is also related to genetic sequences
which present a homology higher than 80%, 85%, 90% or 95% with
SEQ ID NO. 1, SEQ ID NO. 5 or SEQ ID NO. 7 or their complementary
strand.

[0030] The present invention is also related to genetic sequences
which present a homology higher than 75%, 80%, 85%, 90% or 95% with
SEQ ID NO. 3 or SEQ ID NO. 33 or their complementary strand.

[0031] The present invention is also related to genetic sequences
which present a homology higher than 95% with SEQ ID NO. 9 or
SEQ ID NO. 13 or their complementary strand.

[0032] The present invention is also related to genetic sequences
which present a homology higher than 85%, 90% or 95% with
SEQ ID NO. 11 or SEQ ID NO. 15 or their complementary strand.

[0033] The present invention is also related to genetic sequences
which present a homology higher than 95% with SEQ ID NO. 17 or their
complementary strand.

[0034] The present invention is also related to genetic sequences
which present a homology higher than 75%, 80%, 85%, 90% or 95% with
SEQ ID NO. 19 or SEQ ID NO. 27 or their complementary strand.

[0035] The present invention is also related to genetic sequences
which present a homology higher than 80%, 85%, 90% or 95% with
SEQ ID NO. 29 or their complementary strand.

[0036] The present invention is also related to genetic sequences
which present a homology higher than 95% with SEQ ID NO. 31 or their
complementary strand.

[0037] The present invention is also related to genetic sequences
which present a homology higher than 98% with SEQ ID NO. 23 or higher
than 99% with SEQ ID NO. 21 or SEQ ID NO. 25 or their complementary
strand.

[0038] The present invention is also related to polypeptide sequences
encoded by the polynucleotide sequences mentioned hereabove, their
active fragments and variants.

[0039] Active fragments or variants of the polypeptide sequences
according to the invention are molecules which present the same
activity with one or more genetic modifications (such as deletion or
addition of one or more amino acids) in the complete sequences
mentioned hereabove, such as naturally occurring allelic variants.
Such modifications do not modify the above-mentioned percentage of
homology or sequence identity.

[0040] An example of said fragments is the portion of SEQ ID NO. 4
starting from amino acid 719 up to amino acid 1134, which comprises
the COOH terminal portion of sequence SEQ ID NO. 4, said terminal
portion comprising amino acids that are able to bind heavy metals.

[0041] Said variants are also molecules which present a similar
activity to the polypeptides according to the invention through the
same biochemical pathway and acting similarly upon the same active
site.

[0042] The polypeptides can also be integrated as "native" protein,
be part of a fusion protein, or advantageously include additional
amino acid sequences which contain secretory or leader sequences,
prosequences, sequences which aid in purification such as multiple
histidine residues, or an additional sequence for stability during
recombinant production (His tag in the C-terminal sequence).

[0043] Said polypeptides may also comprise marker sequences which
facilitate purification of the fused polypeptide, such as a
hexa-histidine peptide as provided in the pQE vector (Invitrogen Inc.)
and described by Gentz et al. (Proceedings of the National Academy of
Sciences of the USA, 1989, Vol. 86, pp. 821-824), or an HA tag or
glutathione-S-transferase. The corresponding polynucleotide may also
contain non-coding 5' and 3' sequences such as transcribed
non-translated sequences, splicing and polyadenylation signals and
ribosome binding sites.

[0044] Another aspect of the present invention is related to a vector
comprising the polynucleotide or polypeptide according to the
invention, said vector being preferably a plasmid, a virus, a liposome
or a cationic vesicle able to transfect a cell and to obtain the
expression of said polynucleotide by said cell.

[0045] The vector according to the invention may be a shuttle vector
for suitable transformation of different types of cells.

[0046] A further aspect of the present invention concerns the cell
(prokaryotic or eukaryotic cell) or the plant transformed by or
comprising the vector according to the present invention and their use
for phytoremediation (including phytoextraction) of media (such as
soils) contaminated by heavy metals.

Detailed description of the invention 
• Material and methods
Plant cDNA bank 

[0047] A cDNA bank from leaves of one of the best Cd hyperaccumulator
populations of Thlaspi caerulescens (Roosens et al., Plant Cell and
Environment, vol. 26, p. 1657-1672) was integrated in the pYX212
vector. The pYX212 vector is a yeast/E. coli shuttle vector for
expression in S. cerevisiae sold by R&D ingenius company (Madison,
USA). The insert was under the control of the triose phosphate
isomerase promoter, which is one of the strongest constitutive
promoters in yeast. pYX212 is a 2µ plasmid that replicates
autonomously in yeast, being maintained at 25-100 copies per cell. The
plant cDNAs were cloned between the EcoRI and the XhoI sites. The
selection marker in yeast was URA3.

Yeast strain used for transformation

[0048] The Saccharomyces cerevisiae wild-type strain used for
transformation experiments was BY4741, ATCC Number 201388 ("Yeast
Genetic Stock Collection" in the ATCC Global Bioresource Center).

E. coli strain used for transformation

The E. coli strain used for experiments was DH5 alpha, ATCC Number
53868.

Plasmid isolation from yeast and transformation of E. coli strain

[0049] Small scale isolation of plasmid DNA from yeast for
transformation in E. coli was done according to the method disclosed
in Current Protocols in Molecular Biology, 1993, John Wiley & Sons,
Inc (Chapter 13).

[0050] Transformation of E. coli was done by electroporation
according to the method described in Current Protocols in Molecular
Biology, 1993, John Wiley & Sons, Inc (Chapter 1).

Plasmid isolation from E. coli and retransformation of yeast

[0051] Plasmid isolation from E. coli was performed with the Wizard
Plus Miniprep DNA Purification Systems (Promega).

[0052] Transformation of yeast by LiCl was carried out using the
technique disclosed by Gietz, R.D. & Schiestl, R.H. (1995), Methods
Mol. Cell. Biol. 5, 255-269.

• Results 

[0053] The plant cDNA library of Thlaspi caerulescens was screened in
the Saccharomyces cerevisiae wild-type yeast strain BY4741. The
transformants were plated on minimal medium supplemented with cadmium.
From 430,000 S. cerevisiae transformants, 200 clones growing on 15 µM
cadmium were identified. To confirm the correlation between the
cadmium tolerance phenotype and the expression of the plant cDNA,
plasmids were rescued and yeast was re-transformed. Of the 200
plasmids, 150 were re-tested and 110 were reconfirmed by drop tests on
20 µM cadmium and further sequenced. From sequence analysis, 19
different non-redundant cDNAs were identified encoding proteins
displaying significant homology with:

- group I: metal detoxification related proteins:
■ phytochelatin synthase 1;
■ 2 different isoforms of metallothionein type 3 (type 3a and type 3b);
■ metallothionein type 2;
■ metallothionein type 1;
■ metallothionein related protein;

- group II: metal transport related proteins corresponding to Cd/Zn
transporting P-type ATPases;

- group III: signalling pathway related proteins:
■ a heat shock transcription factor;
■ transcription factor IID;

- group IV: other proteins:
■ SAM: salicylic acid carboxyl methyl transferase;
■ chlorophyll a/b binding proteins;
■ 40S ribosomal protein;
■ Photosystem I subunit.

[0054] Three proteins were classified in a last group V with unknown
function.

[0055] The results of sequence analysis and functional classification
of said identified cDNAs are presented in Table 1.

[0056] It should be noted that cDNAs of group II correspond to four
truncated cDNAs encoding proteins with similarity to the C-terminal
region of putative heavy-metal P-type ATPases, also called in the
present description "CPx-ATPases".

[0057] Said results show that the majority of the identified cDNAs
encode proteins known to have a potential role in heavy metal
tolerance, such as metal binding proteins, metallothioneins and
phytochelatins, and the heavy metal binding domain of putative
CPx-ATPases that display Zn2+/Co2+/Cd2+/Pb2+ substrate specificity.

Analysis of cDNAs encoding truncated putative CPx-ATPases 

• In silico analysis:
[0058] In silico analysis of the previously identified cDNAs encoding
truncated putative CPx-ATPases showed the highest similarity with the
C-terminus of A. thaliana HMA4, and the corresponding sequences in
T. caerulescens were therefore hereafter called "TcHMA4".

[0059] The deduced TcHMA4 proteins encoded by cDNAs 71, 165 and 199
lacked the putative catalytic domain while keeping the putative heavy
metal binding domains. In contrast, cDNA 64, the longest isolated,
encoded a protein which contains the ATP-binding site.

• Heterologous expression in yeast: 

[0060] To confirm and to compare the ability of Thlaspi cDNAs 64, 71,
165 and 199 to increase the cadmium tolerance of S. cerevisiae, BY4741
cells expressing these cDNAs were further analysed for their cadmium
tolerance (FIG. 2: Evaluation of growth in the presence of cadmium.
Transformants of the yeast strain BY4741 containing empty plasmid
pYX212 as negative control and pYX212 with Thlaspi cDNAs 199, 165, 64
and 71 were grown in liquid minimal medium overnight. Cultures were
adjusted to A600 of 1 and serially 10-fold diluted in water. 5 µl
aliquots of each dilution were spotted either on non-selective plates
without cadmium or on plates with 20 and 40 µM CdSO4. After three days
of incubation at 30°C, plates were photographed. Dilutions are
indicated at the top of the figures).
[0061] Control cells (carrying the expression vector pYX212) grew
normally in the absence of cadmium but were highly sensitive to
cadmium, and no growth was observed on 40 µM CdSO4.

[0062] Cells expressing cDNAs 71, 165 and 199 were able to grow on 20
and 40 µM CdSO4.

[0063] Expression of cDNAs 71 and 165 afforded the best cadmium
tolerance. Growth was still observed at dilution 10^-3 (~125 cells /
5 µl aliquot) on 40 µM CdSO4.

[0064] In contrast, cells expressing cDNA 64 were more sensitive than
cells expressing the three other cDNAs, and no growth was observed on
40 µM CdSO4.
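
The arithmetic behind the drop-test dilution series can be sketched as follows. The text gives about 125 cells per 5 µl aliquot at the third 10-fold dilution of a culture adjusted to A600 of 1. Reading that dilution as 1000-fold, the implied density of the undiluted culture is about 2.5 x 10^7 cells per ml. That density is back-calculated here and is not stated in the text, and the function and constant names are hypothetical.

```python
def cells_per_aliquot(cells_per_ml_undiluted: float,
                      tenfold_steps: int,
                      aliquot_ul: float = 5.0) -> float:
    """Expected number of cells in one spotted aliquot.

    tenfold_steps is the number of serial 10-fold dilutions applied to the
    culture adjusted to A600 = 1 (0 means the undiluted culture).
    """
    cells_per_ul = cells_per_ml_undiluted / 1000.0  # 1 ml = 1000 µl
    return cells_per_ul / (10 ** tenfold_steps) * aliquot_ul


# Density implied by "~125 cells / 5 µl" after three 10-fold dilutions
# (an inference from the quoted figure, not a measured value).
IMPLIED_CELLS_PER_ML = 125 / 5.0 * 10 ** 3 * 1000.0

# Expected cell count per spot for the undiluted culture and three dilutions.
series = [cells_per_aliquot(IMPLIED_CELLS_PER_ML, n) for n in range(4)]
```

Each step of the series carries ten times fewer cells than the previous one, so growth that persists in the last spot reflects tolerance of a small inoculum.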
[0065] Because growth tests with the wild type strain BY4741 require
a high zinc concentration (11 mM ZnSO4), the zinc related phenotype
was also tested in the zinc hypersensitive zrc1cot1 double mutant.
This yeast strain lacks two vacuolar transporters (ZRC1 and COT1,
which confer Zn resistance by its sequestration into the vacuole (Li
and Kaplan, 1998)) and is more sensitive to zinc than the parental
wild type strain (MacDiarmid et al., 2003).
[0066] The growth profile of transformed zrc1cot1 on Zn was similar
to that of transformed BY4741 on Cd. Yeast cells expressing cDNAs 71
and 165 showed the best zinc tolerance. No difference in growth was
observed between control cells and cells expressing cDNA 64 at the
concentrations used (FIG. 3: Evaluation of growth in the presence of
zinc. Transformants of the zinc hypersensitive zrc1cot1 double mutant
(parental strain BY4741) containing control plasmid pYX212 and pYX212
with Thlaspi cDNAs 199, 165, 64 and 71 were grown in liquid minimal
medium overnight. Cultures were adjusted to A600 of 1 and serially
10-fold diluted in water. 5 µl aliquots of each dilution were spotted
either on non-selective plates without zinc or on plates with 1 and
1.2 mM ZnSO4. After three days of incubation at 30°C, plates were
photographed. Dilutions are indicated at the top of the figures).

Cloning of a TcHMA4 full-length coding sequence 
[0067] To isolate a full-length cDNA, an RT-PCR approach was used. As
cDNA 71 (together with cDNA 165) confers the best tolerance to cadmium
and zinc when overexpressed in yeast, this cDNA was completely
sequenced and used as a starting sequence to design reverse primers.

[0068] Since the highest homology was found with the A. thaliana
HMA4, the corresponding T. caerulescens gene was named TcHMA4
(SEQ ID NO. 4 (FIG. 1)).

Sequence analysis of TcHMA4:

[0069] The amino acid sequence deduced from TcHMA4 aligned well with
several A. thaliana HMAs. The TcHMA4 deduced amino acid sequence
displayed 69% identity and 76% similarity with the AtHMA4 sequence.

[0070] The TcHMA4 and AtHMA4 deduced protein sequences display the
characteristic features of CPx-ATPases in addition to the conserved
motifs of P-type ATPases (the DKTGT phosphorylation motif and the
GDGxNDx ATP binding motif).

[0071] Transmembrane (TM) predictions from various programs were
used, together with the hydropathy calculated by the Kyte-Doolittle
algorithm (Kyte and Doolittle, 1982) and the information from the
location of conserved sequences, to predict the locations of
transmembrane domains in HMA4.
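
A Kyte-Doolittle hydropathy profile of the kind referred to in paragraph [0071] can be sketched as a sliding-window average over the published hydropathy index. The window of 19 residues and the cut-off of 1.6 used below are values commonly applied when looking for membrane-spanning segments; they are assumptions of this sketch, not parameters reported in the text, and the function names are hypothetical.

```python
# Kyte-Doolittle hydropathy index for the 20 standard amino acids.
KD_SCALE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(sequence: str, window: int = 19) -> list:
    """Mean hydropathy of every window of the given length along the sequence."""
    sequence = sequence.upper()
    if window < 1 or window > len(sequence):
        raise ValueError("window must be between 1 and the sequence length")
    scores = [KD_SCALE[residue] for residue in sequence]  # KeyError on unknown residues
    total = sum(scores[:window])
    profile = [total / window]
    for i in range(window, len(scores)):
        total += scores[i] - scores[i - window]  # slide the window by one residue
        profile.append(total / window)
    return profile


def candidate_tm_windows(sequence: str, window: int = 19, cutoff: float = 1.6) -> list:
    """0-based start positions of windows whose mean hydropathy reaches the cutoff."""
    return [start for start, value in enumerate(hydropathy_profile(sequence, window))
            if value >= cutoff]
```

Windows flagged in this way are only candidates; as the paragraph states, the final assignment also relied on other prediction programs and on the position of conserved sequences.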

[0072] TcHMA4, like AtHMA4, is predicted to contain eight
transmembrane domains with a small cytoplasmic loop between TM domains
4 and 5 and a large cytoplasmic loop between TM domains 6 and 7, which
are characteristics of CPx-ATPases.

[0073] The CPx motif (C361PS in TcHMA4; C357PC in AtHMA4) was found
in the sixth transmembrane domain, as well as a specific HP sequence
(H445 in TcHMA4; H441 in AtHMA4) located in the large predicted
cytoplasmic domain, 39 amino acids downstream of the phosphorylation
site.
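
The conserved motifs named above (DKTGT, GDGxNDx and CPx, where x stands for any residue) can be located in a protein sequence with simple pattern matching. The sketch below is illustrative only: the function name and the short test peptide are hypothetical, and a bare CPx pattern will also match positions outside the sixth transmembrane domain, so hits still have to be checked against the predicted topology.

```python
import re

# Motifs quoted in the text; "." stands for the variable residue "x".
MOTIFS = {
    "DKTGT": r"DKTGT",      # phosphorylation motif
    "GDGxNDx": r"GDG.ND.",  # ATP binding motif
    "CPx": r"CP.",          # CPx motif
}


def find_motifs(sequence: str) -> dict:
    """Return, for each motif, a list of (1-based position, matched residues)."""
    sequence = sequence.upper()
    hits = {}
    for name, pattern in MOTIFS.items():
        # A lookahead is used so that overlapping occurrences are all reported.
        hits[name] = [(m.start() + 1, m.group(1))
                      for m in re.finditer(f"(?=({pattern}))", sequence)]
    return hits


# Hypothetical toy peptide, not taken from TcHMA4 or AtHMA4.
toy_hits = find_motifs("MACPSAADKTGTLLGDGANDA")
```

Positions are reported 1-based so that they can be compared directly with residue numbers such as C361 or H445.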

[0074] Besides features typical of CPx-ATPases, the TcHMA4 sequence
also displayed significant differences from these typical features,
which it shared with AtHMA4. Both TcHMA4 and AtHMA4 lacked the
N-terminal metal associated domain (GMTCxxC).

[0075] Nevertheless, both the Pfam and PROSITE databases recognise a
"heavy metal associated domain" in the N-termini of Tc- and At-HMA4.

[0076] The presence of a long COOH extension after the eighth
transmembrane domain was another particular feature that TcHMA4 shared
with AtHMA4 (478 amino acids for TcHMA4 and 470 amino acids for
AtHMA4) and to a lesser extent with AtHMA2 (267 amino acids). All
three peptides also contained three additional cysteine motifs,
C(x)4C, C(x)3-5CC and CC, and a His rich domain within their extended
C-terminus which could be involved in heavy metal binding.
[0077] The His rich domain was present in AtHMA1, where it was
associated with a single CC dipeptide, but in this case in the
N-terminal domain. The TcHMA4 C-terminal fragment corresponding to the
cDNA identified during the screening in yeast consisted of TcHMA4
residues 758 to 1186 and hence lacked the putative catalytic domains
while keeping the putative heavy metal binding domains. These could be
responsible for the higher tolerance to Cd2+ conferred to yeast that
overexpressed that peptide.

Metal tolerance in yeast expressing truncated and full length HMA4
coding sequences

• Cadmium tolerance test:

[0078] To investigate cadmium specificity of HMA4, heterologous
expression in S. cerevisiae was carried out. The wild type strain
BY4741 was transformed with the pYX212 vector expressing TcHMA4-C and
TcHMA4 coding sequences under the control of the strong constitutive
TPI (triose phosphate isomerase) promoter. Growth was monitored on
solid and in liquid media containing various cadmium concentrations.

[0079] Expression of TcHMA4-C allowed S. cerevisiae cells to grow in
the presence of 15 µM CdSO4 on solid media and up to 50 µM CdSO4 in
liquid media, concentrations which reduced growth of control cells
bearing the pYX212 cloning vector.

[0080] In contrast, cells expressing full-length TcHMA4 were far more
sensitive to CdSO4 than the control cells (FIG. 4: Effect of HMA4-C
and HMA4 expression on cadmium tolerance in two yeast strains. Yeast
BY4741 and CM100 cells transformed with the pYX212 plasmid (grey
columns) and with pYX212 containing the T. caerulescens (a, b) and
A. thaliana (c, d) 5' truncated cDNA, HMA4-C (white columns), and
full-length cDNA, HMA4 (black columns), were grown in liquid YNB-ura
without or with 10 to 50 µM CdSO4. Cells were incubated at 30°C for
24 h).

[0081] To investigate whether the effects of HMA4 and HMA4-C
expression were strain-dependent, another wild type strain, CM100, was
transformed with the recombinant pYX212-HMA4 plasmids. The CM100
strain is much more sensitive to cadmium than BY4741, and both the
cadmium tolerance of cells expressing the truncated coding sequence
and the cadmium sensitivity of cells expressing the full-length coding
sequence were confirmed in the CM100 yeast strain.

[0082] To compare TcHMA4 with its Arabidopsis
orthologue, a full-length AtHMA4 cDNA and its truncated
version coding for the C-terminal portion (residues 767-
1172) were cloned in pYX212 and expressed in yeast.

[0083] Phenotypes similar to those described for the
Thlaspi sequences were observed in BY4741 and in CM100.

[0084] Nevertheless, in both yeast strains, the
TcHMA4-C and AtHMA4-C peptides showed consistent
differences in their ability to confer cadmium tolerance.
The tolerance conferred by AtHMA4-C was lower. This
difference was visible at lower concentrations in CM100
than in BY4741 (at 20 µM CdSO4 for CM100 and at up to 50 µM
CdSO4 for BY4741) (FIG. 4).

[0085] These results were confirmed on solid medium
(on 40 µM CdSO4).

[0086] On the contrary, there was no significant 

difference in the enhanced cadmium sensitivity conferred by 
the entire plant HMA4 protein. 

Expression of HMA4 in plants:

[0087] The expression of TcHMA4 was studied in
planta, in shoots and roots, by Northern blot analysis
under stringent conditions (FIG. 5: Northern blot of HMA4
expression in T. caerulescens and A. thaliana. (a) Total RNA was
isolated from shoots and roots of the hyperaccumulator T.
caerulescens and the nonaccumulator A. thaliana. Plants
were exposed to 10 and 100 µM CdSO4 for 24 h. Northern blots
equally loaded with 15 µg of total RNA were probed with,
respectively, the 3' terminal part of TcHMA4 and AtHMA4 (±1,2
kb) and, after stripping, with 18S rRNA as a loading control.
Expression levels were normalized to 18S rRNA. Results are
averages (±SE) from three independent experiments. (b)
Total RNA was isolated from roots of three contrasting
populations of T. caerulescens differing in their cadmium
tolerance and accumulation: Prayon (Belgium), St Felix de
Pallieres (France) and Puente Basadre (Spain). Plants were
exposed to 100 µM CdSO4 for 24 h).
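
The normalisation mentioned in the FIG. 5 legend (signal of the HMA4 probe divided by the 18S rRNA signal, then averaged over three independent experiments with the standard error) can be sketched as below. This is only an illustration: the function names and the band intensities are hypothetical placeholders, not values from the present experiments, and the legend does not state how the signals were quantified.

```python
from math import sqrt
from statistics import mean, stdev
from typing import List, Tuple


def normalize(target_signal: float, loading_signal: float) -> float:
    """Divide the probe signal by the 18S rRNA loading-control signal."""
    if loading_signal <= 0:
        raise ValueError("loading-control signal must be positive")
    return target_signal / loading_signal


def mean_and_se(values: List[float]) -> Tuple[float, float]:
    """Return the mean and the standard error (sample SD / sqrt(n))."""
    n = len(values)
    if n < 2:
        raise ValueError("at least two replicates are needed")
    return mean(values), stdev(values) / sqrt(n)


# Hypothetical band intensities from three independent experiments.
hma4_signal = [1200.0, 1100.0, 1300.0]
rrna_18s_signal = [1000.0, 1000.0, 1000.0]

ratios = [normalize(t, c) for t, c in zip(hma4_signal, rrna_18s_signal)]
avg, se = mean_and_se(ratios)
print(f"normalised expression: {avg:.2f} +/- {se:.2f} (SE, n={len(ratios)})")
```
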

[0088] In the roots of all three tested populations, the
constitutively high expression of TcHMA4 was confirmed. No
significant difference in the level of TcHMA4
expression could be detected between these three
populations by Northern blot.

Analysis of Thlaspi caerulescens cDNAs encoding 
metallothioneins 

WO 2004/078905 



PCT/BE2004/000035 



[0089] Five different MT cDNAs have been identified.
Four encoded proteins representative of the plant MT family
(type-1, -2 and -3) while the fifth encoded an amino acid
sequence displaying similarity to invertebrate MTs but not
to plant sequences. Because of the unique distribution
pattern of cysteine residues in MTs, according to Cobbett
and Goldsbrough (Ann. Rev. Plant Biol., Vol. 53, p. 159-182
(2002)), and the high sequence similarity with Arabidopsis
MTs, the protein-encoding Thlaspi cDNAs identified were
designated as Thlaspi type-1, -2 and -3 metallothioneins
(TcMTs). The cDNA encoding a protein with no homology with
plant proteins was named MRP, for Metallothionein Related
Protein.

Type-3 Metallothioneins:

[0090] The cDNAs 10 and 51 are respectively 465 bp and
463 bp long, both encoding 67 amino acid residues. These
sequences share 94% nucleic sequence identity with each
other in the coding region and 92% / 83% in the 3' and 5'
untranslated regions respectively. Amino acid sequence
identity was 85% and similarity 87%.
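
For illustration, a percent identity figure of the kind quoted above can be computed from two sequences that have already been aligned. The sketch below is a minimal example under stated assumptions: the text does not say which alignment program or which denominator was used, so the convention chosen here (matches divided by compared columns, a gap against a residue counting as a mismatch) is an assumption, and the toy sequences are hypothetical, not the cDNA 10 and cDNA 51 sequences.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percent identity over two pre-aligned sequences of equal length.

    Columns where both sequences carry a gap ('-') are skipped;
    a gap facing a residue is counted as a mismatch.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == "-" and b == "-":
            continue
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared


# Hypothetical toy alignment (12 columns, 2 mismatches).
toy_a = "ATGTCGGACAAG"
toy_b = "ATGACTGACAAG"
print(f"{percent_identity(toy_a, toy_b):.1f}% identity")
```
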

Metallothionein Related Protein (MRP):

[0091] The cDNA 114 is 626 bp long and contains a
coding region of 204 bp, with an 89 bp 5' and a 300 bp 3'
untranslated region. The open reading frame encodes a
protein of 68 amino acids. Seven identical cDNA clones
encoding the 68 amino acid protein were isolated during the
screening.

[0092] A sequence search indicates that the deduced
protein has homology to invertebrate metallothioneins. No
homology was found with plants. For this reason, the
protein encoded by cDNA 114 was named "MRP" for
Metallothionein Related Protein.

[0093] Actually, the highest homology of MRP was not
found with another MT, but with ultra high sulphur keratin
proteins (longer proteins) from human and mouse. However,
cysteine and serine residues are responsible for this
homology.

[0094] The deduced MRP sequence exhibits
characteristics of MTs with regard to number of cysteine
residues and molecular size, but its pattern of cysteine
residues cannot be aligned with the cysteines of plant MTs. MRP
does not share the typical feature of plant MT proteins,
which are characterized by the presence of cysteine-rich
domains in both N- and C-termini, with the central domain
devoid of cysteines.

[0095] The arrangement of cysteine residues in MRP
is peculiar. First, the 16 cysteine residues are
distributed throughout the polypeptide. The two (in type 1,
2 and 3 MTs) or the three (in type 4 MTs) highly conserved
cysteine-rich domains are absent. Secondly, although some
cysteine residues are arranged in motifs common in plant
MTs (X-Cys-Cys-X, Cys-X-Cys or a single Cys residue), others
appear in an atypical motif, Cys40-Cys-Cys.

[0096] Moreover, the deduced MRP sequence has a high
serine content (19%) besides the high cysteine content
(23,5%).
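
The cysteine content quoted above follows directly from the counts given in the text (16 cysteines in a 68 residue protein, i.e. 16/68 = 23,5%). A minimal sketch of this residue-content calculation is given below; the helper name and the toy peptide are hypothetical and are not the MRP sequence.

```python
from collections import Counter


def residue_percent(protein: str, residue: str) -> float:
    """Percentage of a given one-letter residue in a protein sequence."""
    if not protein:
        raise ValueError("empty sequence")
    counts = Counter(protein.upper())
    return 100.0 * counts[residue.upper()] / len(protein)


# Arithmetic taken from the text: 16 Cys residues out of 68 amino acids.
cys_percent_from_text = 100.0 * 16 / 68
print(f"Cys content from the stated counts: {cys_percent_from_text:.1f}%")

# Hypothetical toy peptide used only to exercise the helper.
toy_peptide = "MCCSKCSC"
print(f"toy Cys: {residue_percent(toy_peptide, 'C'):.1f}%")
print(f"toy Ser: {residue_percent(toy_peptide, 'S'):.1f}%")
```
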

Cadmium tolerance test in yeast:

[0097] The ability of Thlaspi metallothionein cDNAs
to increase the cadmium tolerance of S. cerevisiae was checked
using BY4741 cells expressing TcMT cDNAs. cDNAs expressed
from pYX212 in BY4741 were used for a growth drop test on
agar medium containing 0, 20 and 40 µM CdSO4. The empty
expression vector (pYX212) and the Thlaspi phytochelatin
synthase 1 cDNA (TcPCS1) were used as negative and positive
controls, respectively. Phytochelatins are known to play an important
role in cadmium detoxification in plants and were
previously shown to increase cadmium tolerance in S.
cerevisiae (FIG. 6: Transformants of the yeast strain
BY4741 containing empty plasmid pYX212 as negative control
and TcPCS as a positive control, and pYX212 with Thlaspi
cDNAs of interest: TcMT3a, TcMT3b, TcMT2, TcMT1, MRP, were
grown in liquid minimal medium overnight. Cultures were
adjusted to an A600 of 1 and serially 10-fold diluted in water.
5 µl aliquots of each dilution were spotted either on
non-selective (cadmium-free) plates or on plates with 20 and
40 µM CdSO4. After three days of incubation at 30°C, plates were
photographed. Dilutions are indicated above the figures.
Two individual clones of each yeast transformant were
analysed).
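
The serial 10-fold dilution used for the drop test can be written out as a short calculation. The sketch below only assumes the starting A600 of 1 and the 10-fold factor stated in the FIG. 6 legend; the number of dilution steps is not given in the text, so the value used here is a hypothetical placeholder.

```python
from typing import List


def dilution_series(start_a600: float, steps: int, factor: float = 10.0) -> List[float]:
    """Return the A600 of the undiluted culture followed by each dilution."""
    if steps < 0:
        raise ValueError("steps must be non-negative")
    if factor <= 1:
        raise ValueError("dilution factor must be greater than 1")
    return [start_a600 / factor ** i for i in range(steps + 1)]


# Starting A600 of 1 (from the legend); 3 dilution steps is an assumption.
for i, a600 in enumerate(dilution_series(1.0, steps=3)):
    print(f"dilution 10^-{i}: A600 = {a600:g}")
```
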

[0098] Cells carrying the expression vector grew
normally in the absence of cadmium but were highly
sensitive to cadmium, and no growth was observed on 40 µM
CdSO4.

[0099] In contrast, cells expressing TcPCS1 were
able to grow on 20 and 40 µM CdSO4.

[0100] TcMT3a, TcMT3b, TcMT2 and TcMT1 cDNAs
improved cadmium tolerance to the same extent: colony
growth was observed at all dilutions on 20 µM CdSO4. Cells
expressing MRP showed the best cadmium tolerance and were
still able to grow on 40 µM CdSO4 at the highest dilution.

TcMT mRNA expression in plants:

[0101] Expression of TcMT was analysed in three
contrasting populations of T. caerulescens, namely Prayon
(moderately Cd tolerant with the lowest Cd concentration),
Puente Basadre (the least tolerant population) and St Felix
de Pallieres (the most tolerant population).

[0102] RNA was isolated from three-week-old plants
grown in normal medium or treated with 100 µM CdSO4 for
72 h. The full-length labelled cDNAs of Thlaspi MTs were used
as probes in Northern blotting.

[0103] TcMT3 transcripts were more abundant
in shoots than in roots of Thlaspi plants and were not
cadmium regulated. The abundance of TcMT3 transcripts was
remarkably higher in shoots of St Felix de Pallieres, the
most Cd-tolerant and hyperaccumulating population, than in
those from Puente Basadre and Prayon. No difference between
populations was observed in roots.

[0104] No difference in the level of TcMT1 and -2
expression was found upon cadmium treatment, whatever the
population studied. However, marked differences were
observed between shoots and roots. TcMT1 mRNA was abundant
in shoots and undetectable in roots, whereas TcMT2 was
expressed in both shoots and roots, with the mRNA level
slightly higher in shoots than in roots.

Transformation experiments in non hyperaccumulator plants
(for example tobacco plants or A. thaliana plants):

[0105] A maximum of 4 genes of Thlaspi caerulescens
related to cadmium tolerance will be selected, and
constructs in binary vectors will be made in order to
overexpress them in cadmium sensitive and non
hyperaccumulator plants like Arabidopsis thaliana or
tobacco plants. Control plants will be transformed with
empty binary vectors (for example pBIN19).

[0106] The interest in tobacco plants comes from
the fact that tobacco has no wild relatives in the European
flora and that the use of sterile transgenic tobacco plants is
already a strategy selected by pharmaceutical firms to
overproduce therapeutic molecules in fields (Queyrel,
2002). The transformation of chloroplasts or another cell
compartment may be used to avoid gene flow.

[0107] Concerning the production and selection of
transgenic lines, integration of transgenes will be tested
by PCR. Overexpression will be analysed by Northern
blotting, and the number of transgene copies will be estimated by
segregation analysis and Southern blotting. Homozygous
lines with 1, maximum 2, copies will be selected among the
best overexpressors, since transgene stability is favoured
by low copy number. A minimum of 4 independent transgenic lines
per construct will be selected for further study.
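
As an illustration of how a segregation analysis can indicate the number of transgene loci, the sketch below tests observed progeny counts against a Mendelian ratio with a chi-square goodness-of-fit statistic. The text only names "segregation analysis"; the assumption made here is that progeny of a selfed primary transformant are scored for a dominant marker phenotype and compared with a 3:1 ratio (one locus) or a 15:1 ratio (two unlinked loci). The function name and the counts are hypothetical.

```python
from typing import Tuple

# Critical chi-square value for 1 degree of freedom at the 5% level.
CHI2_CRITICAL_DF1_P05 = 3.841


def chi_square_segregation(
    with_marker: int, without_marker: int, expected_ratio: Tuple[int, int] = (3, 1)
) -> float:
    """Chi-square statistic of observed counts against an expected ratio."""
    total = with_marker + without_marker
    if total == 0:
        raise ValueError("no progeny scored")
    parts = sum(expected_ratio)
    expected_with = total * expected_ratio[0] / parts
    expected_without = total * expected_ratio[1] / parts
    return (
        (with_marker - expected_with) ** 2 / expected_with
        + (without_marker - expected_without) ** 2 / expected_without
    )


# Hypothetical counts of progeny with / without the marker phenotype.
observed = (78, 22)
for ratio in ((3, 1), (15, 1)):
    stat = chi_square_segregation(*observed, expected_ratio=ratio)
    verdict = "compatible" if stat < CHI2_CRITICAL_DF1_P05 else "rejected"
    print(f"ratio {ratio[0]}:{ratio[1]} -> chi2 = {stat:.2f} ({verdict})")
```

A fit to 3:1 is only suggestive of a single insertion locus; the Southern blot mentioned in the text remains the direct estimate of copy number.
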
[0108] Concerning the characterisation of transgenic
lines, a growth test in hydroponic culture and a mineral
analysis will be done as follows: seeds of selected lines will be
sown and plants will be transferred to hydroponic culture,
where the metal treatment can be precisely and
homogeneously controlled and roots as well as leaves
can be easily harvested. Fresh and dry weight of heavy
metal-treated and non-treated plants will be measured.
Heavy metal contents and allocation (proportion in leaves
and roots) will be analysed by atomic absorption
spectrophotometry. Phytoextraction capacities of the
different lines (measured as the heavy metal concentration
in the shoot multiplied by the shoot biomass) will be
compared with those of the control plants and of the original
hyperaccumulator species.
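
The phytoextraction capacity defined above (heavy metal concentration in the shoot multiplied by the shoot biomass) can be sketched as a small calculation for comparing lines. The class and function names, the units and all numerical values below are hypothetical placeholders chosen for illustration; they are not measurements.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class LineMeasurement:
    name: str
    shoot_metal_mg_per_kg: float  # metal concentration in shoot dry matter
    shoot_dry_weight_kg: float    # shoot biomass per plant

    def capacity_mg(self) -> float:
        """Metal extracted per plant: concentration x shoot biomass."""
        return self.shoot_metal_mg_per_kg * self.shoot_dry_weight_kg


def rank_by_capacity(lines: List[LineMeasurement]) -> List[LineMeasurement]:
    """Sort lines from highest to lowest phytoextraction capacity."""
    return sorted(lines, key=lambda m: m.capacity_mg(), reverse=True)


# Placeholder values only, to show the comparison described in the text.
measurements = [
    LineMeasurement("control", 50.0, 0.010),
    LineMeasurement("line_A", 200.0, 0.008),
    LineMeasurement("hyperaccumulator", 1000.0, 0.001),
]
for m in rank_by_capacity(measurements):
    print(f"{m.name}: {m.capacity_mg():.2f} mg metal per plant")
```

Because the capacity is a product, a line with a moderate shoot concentration but a larger biomass can rank above a low-biomass plant with a very high concentration.
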

[0109] The best transgenic lines can be further
tested on polluted soils. In the future, the best lines can
be crossed to improve the phytoextraction capacity.

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Remark concerning Table 1 : 

(1): complete coding sequence cloned;
(2): complete coding sequence cloned but partially sequenced;
(3): truncated coding sequence cloned;
(4): truncated coding sequence cloned and partially sequenced;
(5): truncated coding sequence cloned in yeast but further completed by
RT-PCR and 5' RACE;

- clone#8 corresponds to SEQ.ID.NO.1 and 2;
- clone#71 corresponds to SEQ.ID.NO.3 and 4;
- clone#10 corresponds to SEQ.ID.NO.5 and 6;
- clone#51 corresponds to SEQ.ID.NO.7 and 8;
- clone#167 corresponds to SEQ.ID.NO.9 and 10;
- clone#114 corresponds to SEQ.ID.NO.33 and 34;
- clone#213 corresponds to SEQ.ID.NO.11 and 12;
- clone#159 corresponds to SEQ.ID.NO.13 and 14;
- clone#27 corresponds to SEQ.ID.NO.15 and 16;
- clone#50 corresponds to SEQ.ID.NO.17 and 18;
- clone#169 corresponds to SEQ.ID.NO.19 and 20;
- clone#92b corresponds to SEQ.ID.NO.21 and 22;
- clone#65b corresponds to SEQ.ID.NO.23 and 24;
- clone#82 corresponds to SEQ.ID.NO.25 and 26;
- clone#79 corresponds to SEQ.ID.NO.27 and 28;
- clone#62 corresponds to SEQ.ID.NO.29 and 30;
- clone#215 corresponds to SEQ.ID.NO.31 and 32.
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CLAIMS 

1. An isolated and purified polypeptide
useful in phytoremediation, presenting more than 40%, 50%,
60%, 70%, 80%, 85%, 90% or 95% sequence identity with the
sequence SEQ.ID.NO.4, its variants and active fragments
thereof.

2. The isolated and purified polypeptide
sequence according to claim 1 wherein the sequence is
isolated and purified from Thlaspi caerulescens.

3. A polynucleotide sequence encoding the
polypeptide sequence according to claim 1 or 2.

4. The polynucleotide sequence according to
claim 3 further comprising, operably linked to it, one or
more adjacent regulatory sequence(s).

5. The polynucleotide sequence according to
claim 4 which is a sequence presenting more than 40%,
preferably 50%, 60%, 70%, 80%, 85%, 90% or 95% sequence
identity with SEQ.ID.NO.3, its variants and active
fragments thereof.

6. The fragment of the polypeptide of claim 1
or 2 having an amino acid sequence starting from amino
acid 719 up to amino acid 1134 of SEQ.ID.NO.4.

7. A vector comprising the polynucleotide
sequence(s) according to claim 3 or 4.

8. A recombinant host cell or plant
transformed by one or more polynucleotide sequence(s)
according to claim 3 or 4 or the vector according to
claim 7.

9. The recombinant host cell according to
claim 8, which is selected from the group consisting of
bacteria (E. coli) or fungi, including yeast.

10. The recombinant host cell according to
claim 9, said host cell being S. cerevisiae.

11. The recombinant host cell according to
claim 8, said host cell being a plant cell.

12. The recombinant host cell or plant
according to claim 8, which is selected from the group
consisting of Arabidopsis thaliana, tobacco, plants of the
Brassicaceae family, and of the Caryophyllaceae family.

13. A method for the phytoremediation
treatment of a medium, preferably a soil, contaminated by
heavy metals, preferably cadmium, said method comprising
the step of cultivating upon said contaminated medium a
genetically transformed plant according to claim 8 or
12.

14. The method according to claim 13
wherein said phytoremediation is a phytoextraction
treatment of the medium which comprises the step of
recovering and destroying said cultivated plant and/or the
step of recovering said heavy metals from said cultivated
plant.
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^clone #8 (pfrvto chelate syrvtnase 1) 

GGCAC^AGGGT^AGTGT^CGTAAi^GTTCTTCTTCCTGCTTTtWTTCTC 

GGATCCAGTi^GGAATCTCCC^GCGCAaGTTTGTTCTTCTCTGT^TTT 

AGTTTGTATCGGAGATCTCTTCCATCTGCTCCGGCGAtTGaCTTTTCTTC 
TGCCC»GGGA2tf^GCTAATCTTC^ 

^GGGTTTTTCAGGTTGATTTCTTATOTCGMACGCAGtTCCGAACCTGCG 
-TTTTSTGGTTTGGCTAGCCTT^CTCTOGTGTTGAATGCTCTTTCTATTGA 
T CCT GGACG&AAAX (5GAAAGGGCCTTGGAGGTG GXTTGATGAATC&AT GC 
TGGATTGCTGCGAGCCACTGG^GTAGTG^AGGATAAGGGAATTTC^TTT 
GG&SAAGTCGX GTGXX TGGCTC ATTGTTCaGGAGC2W^vAGTGSAAGCtTTT 




TTTGAGCAGACTGGGTCTGGXCACTTTTCACCtA^GTGGCTAT^TGC 
TGA^GAGATATGGCTCTGATTCTYGA'TGTTGCCCGTTTCAAGTA^CCXG 
CXCACT GGAT^CCTCX TA&kCTTC XXTGGG A&GCCATGGACAG CftXXGAX 
GAGACAACAGGG^AACGTAGAGGGTTCATGCT CAXAXCX&G&C CGCAC&G 
AGAACCTGGATXGCTCTATACTCTG^GCXGCA^GGATGAM.GCTGGATCA 
GCATAGCCCAGT^TTXGA^GGAAGArGXXCCTCGTCTTGTAAGTTCAGAG 
A^TGTAGATTCTGTGG^AAai^TCGrATCAG7TGXG7TC^TTCACTTCC 

ctcaaaactct^ccaaxtcatcagatgggtggctgaggxcag^ta^crg 

A^GACACA^CAAAAATC^ CAGCGCCGAGC^GAAAXGGftGG GT GA&GT XA 
5UkGC AAGX G GTGCT <3ASAG SAG T GCAGGAAkCX GAACTGTT CA&ACACGT 
TAG TA&GT AX T TGTCC TCAGTGGGXXACGAGGAC AG X C XGGCATAT GCAG 
CTGCAAAGGCTXGXTGCCAftGGAGCTGAAATCTTGTCGGGAACCTCGXCA 
AAAGAGTTCTGTTGTCGGGA^CTXGXGTGA^XGCGTCAAAGGTCCTGA 
AGAGGC^G^AGGCAGGGTGGTGACXGGAGTXGTGGTGCAXGACGGC-AGTG 
AACAiyUlAAXTGAXCXTXXGGXGGCAXCGACCCAAACAG^CTGTG2^XGX 




TSCi».GAAATGAAGCAGCTCRTTTCCATt^CTTCCCTCCCAACTATG^TT 
C^GAAGAGG'TA'rTGCa'rCTTCGACGTCaACTTCftGCTGTWJiACGATG 
XC^GAGaACi^GGA^GGAaGATCTCGTTGCTCCTGCCTyTTG&TTCT 
TCT^CCCAAATTCACACTCTTCTTCCCCAATCGAA'TCCCGGTTTTTTTSA 
^ATAAAACCGTAATTGTAAGAGAGTATTTTATTT'TCCGTATGATATTCAa 
ACTCTATTTGCAGTGAGAGAGATCTGTATCCTATATAA'TAAXASAGTXAT 
AAAACCATTATCATCCCAA^J4AARAAAAAAA5W?iAAAAA 

SEQ. ID. No. 2
>clone#8-prot (phytochelatin synthase 1)

MAMASLYRRSL&SPPAiDFSSASGKLI^EALQKGTSffiGS-FPXISirQTQSEPAFCGLAS 
LSVVLMALSXDPG1lK5JKGPTOWFOESt3LDCCE?LEV>TCDKGrSFGKWCI^CSGAKV^ 
E3.TNQSTI DWFRN FWKCATS DNCHKISTYIffiGVFBQT GSGHFSP2 GGYMASKPMALILD 
VARFJCf P PEWI PLfCLLWSAMDS IDSTTGK?,aGFMLI SKPHSSPGLIYTLSCKDES WI S I A 
OrLKSDVPRl.VSSENVDSVSKIVSV^-rNSLPSKI.NQriRWVAEVRITEDTWKN15AEEKS 
oT.OT.vnt.n?T.Tfwroir'Fr.T,irKHVS KYLSSVGYEDSLAYAAAKACCQGAEII'SGTSSKEFCCR 




ETCVKCVSsGFSEauiiUTV VXLjV v v , ^vws&w^.i-^jj-'"**-'-'->4- 1 -'-*'--"-*"-'»--"" 

MIALPAQT WS GIKDQ AFMQEMKQL2 SMAS 3j P TMLQ EEVXiHLSK'QLQI'LKilCQENKKiuEDL 

VAPAF 
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SEQ . . ID . NO • 3 

>clone#71 (fragment of potential Cd/Zn transporting P-type ATPase)
GAAAAGCCAAGAGAAGGT GATGTTGATGAGACCAGCT AGTAAAACCAGT C 
TGACCATCTTCACTCTGGTTGTTGTGGTGAAAAGAAGCAAGAGAGTGTAA 
AGCT T GT GAAAGAT AGCT GCTGCGGTG AGAAAAGT AGGAAACC AGAGGG A 
GA.TATGGCTT C ACT GAGC T CATGC AAGAAGT CTAACAAT GACAT GAAAAT 
GAAAGGT GGT T C AAG T T G T T GT GC TAGTAAAAAT GAGAAGCT GAAGGAAG 
CAGTAGTAGGAAAGAGCTGCTGTGAAGACAAGGAGAAAACAGAGGGAAAT 
GTT GAGAT GC AGAT TCCAAATTTGGAGAAAGGGT C GCAGAAAAAGGT T GG 
TGAAACC T GCAAAT CAAGC TGTTG TGGAGATAAAGAGAAGGCTAAGGAAA 
CAC G TT T GT T GCTT GC TAG T GAGGATCCAT CTTAT C T GGAGAAGGAAGAA 
AGGC C AAACT ACNT GAAGC TAAGAT TGT GNCCAG T GAAACAGAGCTGCCA 
TGAGAAGGC AAGTC T GGAC AT TGAAAC T GGAGT T AC T TGTGAT C TCAAGT 
TGGTCTGCTGCGGAAACATAGAAGTGGGAGAGCAATCTGATCTTGAGAAA 
GGCATGAAGT TAAAGGGT GAAGGACAAT GCAAGT C TGAC TGCT GC GG TGA 
TGAAATACCT C TAGC TT CT GAGGAAGACAG T GT GGATTGCT CC T CCGGAT 
GC T GCGGAAACAAGGAGGAAT TGACA.CAAATCT GT CAT GAGAAGACAT GT 
CTGGACATTGTAAGTTGTGATTCCAAGTTGGTTTGCTGTGGAGAAACAGA 
AGT GGAAGT GAGAGAGC AAT GTGAT C T C AAGAAGGGTC T GC AGAT AAAGA 
ATGAAGGACAAT GCAAGT C T GTTCGTTGCGGTGATGAAAAGAAAACAGAG 
GAGATAACTGAAGAGACGGACAATCTGAAAAGTGAAAGTGGTGATGATTG 
CAAAT CT CT T T GTT GT GGAACTGGT T T GAAGCAAGAAGGGTCTT CTA.GT T 
T GGT CAAT GTT GT GGT GGAGAGTGGT GAAT CCGGGT CAAGCT G T T GCAGC 
AAGGAGGGAGAGAT AG TGAAAGTC T C TAGCCAAAGC TGT T GC GCAAGT C C 
AAGTGATGTGGTGTTATCTGACTTGGAAGTCAAGAAACTAGAGATTTGTT 
GCAAAGCGAAGAAGACTCCAGAGGAGGTTCGTGGA.TCTAAATGTAAGGAA 
ACAGAGAAGC G TCAC CAC GT T GGTAAAAGC T GTT GCAGGAGTTAT GCAAA 
AGA.G TAT T GCAGC CACAGGCAT CAC CACCAC CACC ACCAC CACCATGT T G 
GGGCTGCTTGA 
SEQ. ID. No. 4

>clone#71-prot (fragment of potential Cd/Zn transporting P-type ATPase)
MASLS SCKKSNNDMKMKGGS S CCASKNEKLKEAWAJCSCCEDECEKTEGNVEMQIPNLEKg" 
SQKKVGSTCXSSCCGDKEKAKETRLL:^^ 

LDIETGVTCDLKLVCCGNISVGEQSDLEKGMKLKGEGQCKSDCCGDEIPIASEEDSVDCS 
SGCCGNKEELTQICHEKTCLDIVSCDSKLVCCGETEVSVREQCDLKKGLQIKNEGaCKSV 
RCGDEKKTEE I TEETDNLKSES G DDCKS LCCGT GLKQEGS S S LVNVWESGE3GS S CCS K 
EGE I VKVS S QS C CAS P S D VVLS DLE VKKLE I CCKAKKT PEEVRGS KCKE TEKRHHVGKS C 
CRSYAKEYCSHRHHHHHHHHHVGAA 
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SEQ- ID. No. 5 

>clone#10 (metallothionein type 3)

GGCACGAGGC GAACATAC ACAAGAAC TAAAACAAT CTTT CAAGC T TT T T T 
GTT CTAAAAAAAC CAAT C ATG T CGGA.CAAGT GCGGAAGC T GCGA.CTGTT G 
TGA.CAAGACC C AG T GCGT CACGAAGA.GTAC CAGCTACAC C TTGGACATGG 
TCGAGACTCAGGAGAGCTACAAGGAGGCCATGAACATGGACGTTGGTGCA 
GAAGAGAACGGGTGCAAATGCATGTGCGGCTCTACCTGCAGCTGCGTCAA 
CGGCACTTGCAGCCCCAACXAAAJ^GAiUVAGGCTCCTAJ^GACCTTAAAAC 
AGGGC CA.T TT CT CT T TTC C TGCT T TT AT CAAAATG T A& T AT GAAT AAAAG 
TAGAT GT GAG C CACATCT C TC T C TCT C TTAT TATAT GTAAT TCAGACT CT 
CTAC TAT GGC G T GA.T GTAATTGGT T TAT GGC CCCT TAT CCT C TAATAT AC 
AT CAT CT TAT GAT C TAAAAAAAAAAAAAAAJIAAAAAAAAAAAAA 
AAAAAAAA 
SEQ. ID. NO. 6 

>clone#10-prot (metallothionein type 3) 
MSDKCGSCpCCDKTQCVTKSTSYTLDMVE^ 



SEQ. ID. No. 7 

>clone#51 (metallothionein type 3) 

GGC AC GAGGAGAACT CGAAC AT A.C AC AAGAACT AAAACAAT CTTT CAAGC 
TTTTTTCTT CTA AAAAAAC CAAT CAT GACTG ACAAGT GCGGAAGCT GCGA 
CT GT GCTG ACAAGACCCAGT C CGT CAAGAAGAGTACCAGC TAC ACCTTGG 
A.C AT G G T C GAG AC T C AGGA.G AGC T AC AAGG AG GAC AT GAAC AT G GAC GTT 
G TT GCAGAAGAGAAC GGGT GCAAAT GCAAGTGCGGCT CTAC C TGCAGC T G 
C C T C AACT GC AC T T GC GGC C CAAAC T AAAAAAAGGACC T TAAAAAAGGGG 
C CAT T T C T AGT T T CAT C C T T T GAT C AAAAT GT AAT AT G AAT AAAAGT T G A 
T G^ GAGCCAC AT C TCT C TCT TAT T AAAAAT GT AAT TCAGACTCT TC ACT A 
TGGCGTGAT GTAAAT TAGT T TAT GGC CCCT TATCCTCTAATATACATCAT 
CTTATTAT C TAT TAAAAAAAAAAAAAAAAAAAAA 

SEQ. ID. No. 8 
>clone#51-prot (metallothionein type 3) 
MTDKCGSCDCADKTQSVKKSTSYTLDMVETQ 



SEQ. ID. NO. 9 

>clone#167 (metallothionein-like protein type 2)
GGCACGAGGTT C GAAT TT T CTAGAGAAAATC TC T T G CTGTGGAGGAAACT 
GTGGTTGCGGATCTGGCTGCAAGTGCGGCAACGGATGCGGAGGTTGCAAA 
AT G TA.C C CAGAC TT GGGT T T CT C T GGTGAGACCACCACCACCG AGAC TC T 
TGTCCTCGGCGTTGCCCCGGCGA.TGGACTCCCAGTACGAGGCTTCCGGCG 
AGACCT TC GT T GCCGAGAAT GAT GC CT GCAAAT GCGGAT CTGACT GCAAG 
TGCAACC C T T G TACCTGC AAATGAACAACCCAT AAACC CTAAGAGT CT GC 
AATAACCCTAATGTTATGTTA.GGTCTGGTTATGTGTAATAATGGCTGATT 
TCGCCGGTTGTTTTGCCGGTCCTTCTCTTCTTCTGCTGTGTGTTTTTATG 
GT T T GG T CATAANATAT CGCT GCAC GTT T TATC TATGT GACTAT ATAAT C 
AAAT AT TAT TAT GGGTTTGTTTT CNAAAJUIAAASAAAAAAAAAAA 
SEQ. ID. No. 10

>clone#167-prot (metallothionein-like protein type 2)
MSCCGGNCGCGSGCKCGNGCGGCKMY"PDLGFSGETTTTETLVLGVAPAMD 
SQYEASGET FVAENDACKCGS DCKCNPCTCK 
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SEQ. ID. No. 11 

>clone#213 (metallothionein-like protein) 
GGCACGAGGGCAAAAGAAGAAT CANACAACAANAAACTACAAAGTT TAAT 
CAAAGAGAAGTAAGAGAAACAAT GGCC GGTT C TAAAT GT GGTGACT CTT G 
GAGT T GCGAGAT GAAC TACAACACGGAGT GC GACAGC T GC AGC TGT GGAT 
CAGACTGCANC T GTGGG TCNAAC T GCAAC T GT T GANAAATNGTGGT T TAA 
, AAT CACAT GT AT GCA„GGAAAAACT GGGGAAAAATAT GTTAANANAT CCGN 
GTGTGTTTTGAATAATTCTCTTNACCTTGACTTATTTCCTGCTTTGTATT 
TNTNCTGTTNGTTGA 
SEQ. ID. No. 12

>clone#213-prot (metallothionein-like protein) 
MAGSKCGDSWSCEMNYNTECDSGSCGSDCXCGXNCNC 



SEQ. ID. No. 13

>clone#159 (heat shock transcription factor)
GGC AGGAGGCT GAAGT GAT CC AATT GAAACT T TCTT T GGT T C T CAAGTCT 
CTTTGTCCTGTTTTTTTCTGAGTGGTGTGTGAATTGTAAGCTTTTGTTAA 
GAG T AAGAGTT T T GA.GAAAAT TGT GGTT T T GAGAGATGGAT GAGACTAAT 
CAT GGAGGT TCAACAAGCT CACT CCCACCTT TCCTCACCAAAACA.TATGA 
GATGGTTGACGACTCTTCATCGGACTCAATCGTCTCGTGGAGCCAGAGCA 
ACAAGAGC T TC AT CGT T TGGAAT C C T C CAGAGT TTT GCAGAGATCT T CT T 
CC GAGAT TCTT CAAGCAC AACAAC T T C T C AAGCT TT AT CC GT CAGC TTAA 
CAC ATAT GG TT TTAGAA?\AT C TG AT CCCGAGCAATGGGAATT T GC GAACG 
ATGATTTCGTGAGAGGCCAAC CT CATCT GATGAAGAACATTCACAGACGC 
AAAC CAGT T C A.CAGC CACTCT TTAC CTAAT CT CCAAGCT CAGCAAAC TC C 
GTT GACGGAT T CGGAGC GAC AGAGGAT GAATAAC CAAAT CCAGAGAC T T A 
CAAAGGAGAAAGAAGGACT GC TCCAAGAGT TACAGAAACAAGAGGAGGAG 
C GT GAAGGGT T T GAGCAAC AAGT TAAAGAGCTAAAAGAT CGT T TACAACA 
CAT GGAGAAGC GTCAGAAGAC GAT GGTT T C GTAT GT C T C T CAGGTAT TG G 
ATAAACCA 

SEQ. ID. No. 14

>clone#159-prot (heat shock transcription factor)

MDETNHGGS T S SL PP FLTKT YEMVDDSS S DS I VSWSQSNKS FI^/WNP PEFSRDLLPRFFK 

HNNFS S FI R.QLNT YGFRKS D PEQWE FANDD FVRGQPELMKNI HRRKPVHS HS LPNLQAQQ 

TPLTDSERQRMMQIQRLTKEKEGL^ 

VSYVSQVLDKP 



SEQ. ID. No. 15 

>clone#27 (putative glucosyltransferase)
GGCACGAGGGT GGT CAGAC CTT CAGAGGAGGCAAAACT CC CAT TA_GGGTA 
TCTT GAGAC AG T GAATAAAGACAAGAGC T TGGT CTTGAAA.T GGAGT C C T C 
AGC TTGAAGT T T TA.T C CAACAAAGCCAGT GGAC CGAT CAAC C GA.T GAAC G 
CAAAGTACAT AC A^ GATG T GT GGAAGTG T GGAG TTCGTGT GAAGATAGAC 
AAAGAAAGT GGGAT T GC C AAGAGAGAGGAGAT T GAAAT TAGTATAA AGG A 
AG T GATGGAAGGAGAGAC GAGCAAAGGGAT GAAGGAA&ACGC AAAGAAAT 
T GAGAGAC TTGGCTGT CAAGT CAC T CAAT GAAGGAT G CT CTACAGATAT 

SEQ. ID. No. 16
>clone#27-prot (putative glucosyltransferase)
S FIQQSOWTDQPMNAKYIQDWKCGVRVKI DKES GIAKREEIE IS IKEVM 
EGETSKGMKENAKKIaRDLAVKSLNEGCSTD 
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SEQ. ID. No. 17 

>clone#50 (transcription factor II) 

GTAGGGTTACaACGGGGACTCCGCAGTAGTCGCTCTCCGATCCCTTCTTC 
TCCCGGCC^AAATCCGrCTAAACTTCTCTTCCTCAGCATCGATTGCCTCG 
TCTCAGCTCAATTCTCrACGTTTTCACGTTACTGCTTCGTTTAGAACCTT 
CACTTGAGTACTTTGGTGGTGGGAGAGATGAATCACGGCCAACAATCTGG 
CGAGGCAAAGCATGAAGATGACGCTGCGCTTACAGAGTTCCTTGCTTCTC 
TTATGGATTATACTCCTACTATTCCTGATGATCTAGTGGAGCACTACTTG 
GCTAAGAGT GGGT T T CAGTGC C CC GACGTTCGAT TAATAAGGC TAGT T GC 

TGTGGCTACACAAAAGTTTGTTGCTGATGTTGCCAGCGACGCCCCOTCAG 
CAC T GCAGGC TAGACCAGCACCCAGT T GT TAAAGACABAAAAC AGCAAAA 
GGATAAGCGT TT G ATAT T GACAATGGAAGACC TT T CAAAAGCT T TGCGTG 
AG TAC GGT GT GAAC GT GAAGCAT CCAGAAT AT TT TGC TGATAGC C CTT CG 
AC CGGAATGGA.TCC TGCGACAAGGGACGAATAGAAAC CTGAGGAAGTC T T 
T G CCTAGAAAGGAT GAT CATGT AT G T GAGAT CCGT GATT T T C T AT CGT GT 
T T CAGT TAAAACAAACAAAACT CAATTC TA.T T CCTAGTCAC CAG TTACGT 
GTATATTGCTTTTGTTGTTCGTTTCTTGACTTGCGTCTCTGGTTTCCTAC 
AACAC T TAT C T TTC ATTC T TG TAAGTC T T CAAAT C GTGATAATAAGATAA 
GTAT C C T TAT GAGT T TTAAAA 

SEQ. ID. No. 18 
>clone#50-prot (transcription factor II)
MNHGQQSGEAJECHEDDAALTEFLASLMDYTPTIPDDLVEHYLAKSGFQCPD 
VRL IRL VAVAT QKFVADVAS DAP SALQARPAPSC 



SEQ. ID. No. 19 

>clone#169 (S-adenosyl-L-methionine:salicylic acid carboxyl methyltransferase-like protein)

GGCACGAGGTAAT TC TC C T CTAAT CCTAT CAC TAAT T GATAAG TAC GATA 
CAAAAAAT GGAT TCAAGAT T T ATC GAC ACCAT T C CT T C C TTGAGCT ATAT 
TAAT GACGATAAGAGT GAT GAT GAATAT GCGT T T GT GAGAGC T T TACGTA 
TGAGT GG T GG T GAT GGAGC CAACAGCTACT CC GC C AATT CT CT T C T T CAG 
AGAAGAGT T T T AT C AAT GGC C AAAC CAGTAT T GG T AAAAT AC AC AGAAGA 
AAT GAT GAT GAACT TAGAC T TTCC AAAGTACATCAAAGTTGCT GAAT T GG 
GTTGTTCTTCGGGACAAAACTCATTTCTGGCTATCTCTGAGAT CATCAAT 

AC CAT CAAT AT GTTG TGCC AACAAT CGAACCAAAAC CCACCAGAAAT C GA . . • 

TTGTTGTCTGA AT GATC TT C CGGGAAAC GATT T CAACAC GAC C TT CAAGT 

TC GT ACC T T T C T T CCACAAGAAGC T CAT GAT CACAAAC AGAAC AT C GT GT 

TTCGTCTATGGAGCTCCAGGGTCCTTCTACTCTAGGCTCTTCTCTCGCAA 

TAGCCTCCATTTCATACACTCCTCTTACGCCCTCCATTGGCTCTCTAAGG 

TTC CTGAACAA.CTCGAGAACGATGAGGAAAATGT GTACATAACAAGCTCA 

AGT C CTCAAAGT GC ATACAAGGCT T ACT T GAAT CAAT T C CAAAGAGAT T T 

CAC CATGT T TCTAAGGTT ACGTT CTTGAGAAGTT GTCTCTAA 

SEQ. ID. No. 20

>clone#169-prot (S-adenosyl-L-methionine:salicylic acid carboxyl methyltransferase-like protein)

MDSRFIDT I P S LS YINDDKS D DEYAFVRALRMS GGDGANS YSANSLLQRRVIiSMAKPVLV 
KYTEEMMMNLDFPKYIKV^^ 

LPGNDFNTTFKFVPFFHKKLMITNRTSCFVYGAPGSFYSRLFSRNSLHFIHSSYALHWLS 
KVPEQLENDEENVYITSSSPQSAYKAYLNQFQRDFTMFLRLRS 
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>clone#92b (Chi A-B binding protein) 

GGCACGAGGAAGTTATCCACTCAAGGTGGGCCATGCTCGGAGCCCTAGGC 
TGCGTCTTCCCGGAGTTGTTGGCCAGGAACGGAGTCAAGTTCGGA.GAGGC 
GGTGTGGTTCAAGGCCGGTTCGCAGATCTTCAGCGAAGGAGGGCTCGATT 
ACT T GGGAAAC CCAAGCTT GGT T CACGC T CAGAGC ATTT T GGC GATAT GG 
GCC ACT CAGG T GAT CTTGAT GGGAGCT GT TGAAGGT TACAGAGT C GCUGG 
AaACGGGCCGTTGGGAGAGGCCGAGGACTTGCTTTACCCAGGTGGCAGCT 
TCGACCCATTGGGCCTCGCTACCGACCCAGAGGCCTTCGCGGAGTTGAAG 
GTC AAGGAGC T CAAGAACGGAAGATTGGCTAT GXT C T C TAT GTT C GGAT T 
CTTCGTTCAAGCCATCGTCACCGGTAAGGGACCAATCGAGAATCTTGCTG 
ACCATT T GGC C GAT C CAGT C AAC AACAAC GCT T GGGC CTT C GCCAC CAAT 
TTGGTTCCCGGAAAGTGAGCCAAGTTTTTATCTGTTTGTAATTTGTTTTT 
CT T T GC T T CAGT CT T TTGAATT C GAGT GAGAG T GAGGTAAGAGG AGAAAG 
AGTAAAAGGT T T GTGT T GGT GAT GAT GGAT GGTT GAGACTT T CAGATGTA 
AAT T T GT AAGACCT T GTAT GGCT TATCAT TAATCAAATAAC TCGTTTTTC 
TCAAAAAAA*yiA£AAA&AAAAAA^ 

SEQ. ID. No. 22

>clone#92b-prot (Chl A-B binding protein)
HEEVIHSRWAMLGALGCVFPELIARNGVK^ 

S ILAIDvAT QVI LMGAVEGYR.VAGNGPLGEAE DLL YPGGS FD P LGIjAT D PEAFAELK7/KEL 
KNGRIAMFSMFG FFVQAI VT GKGP I ENLADHL AD P VNNNAWAFATNL VP GK 



SEQ. ID. No. 23

>clone#65b (photosystem I subunit)

GGCAC GAGGC AT T T GTCAAGGC TGGC C CAT TAAGGAACAC TCC T TAC GCC 
GGCTCCGCTGGCTCTTTGGCCGCAGCTGGACTCGTAGTCATCCTCAGCAT 
GTGCCT CAC C AT CT ACGGGAT CTCTTCTTT CAAT GAAGGAGAC C C TT CGA 
TCGCACCGAGTTTGACTTTGACCGGACGGAAGAAGCAGCCTGACCAGCTT 
CAGACTGCTGACGGATGGGCTAAGTTCACCGGAGGGTTCTTCTTCGGTGG 
GATCTCTGGCGTGACTTGGGCTTACTTCCTTCTCTACGTTCTTGACCTTC 
CT TAC TAC GT C A&ATGAAT GTAGTTGAAAATATATATGAGT GTACTT TCA 
ACTCTCTCTTGCATCTTTGTTCTTCTTTTGTCTTGATCAAGAATCTTGAA 
T C T TAAGGGAAT GAT TAAT GTATAT TAC TAT GGAT CT TT T CTT AACAT T T 
AATAATTTATATTGCCTTG^AAAAAAAAAAAAAAAA 

SEQ. ID. No. 24

>clone#65b-prot (photosystem I subunit)

HEAFVKAG PL RNT P YAGS AG S LAAAGLWTL SMCLT I YGI S S FNE GDP S I 

AP SLTLTGRKKQP DOLQTADGWAKFT (£££FFGGIS GVTWAYFLL YVLDLP 

YYVK 



SEQ. ID. No. 25 

>clone#82 (40S ribosomal protein) 

GG CAC GAGGC GCC GAC GAAGCTC TGCAACAACAAT GGCGACT CAGATCAG 
CAAAAAGAGAAft.GTTCGTTGCCGATGGTGTTTTCTACGCGGAGCTJiAA.CG 
AGG T C C T GACAAGAGAGC T C GC T GAAGAT GGTTAC T CTGGTGT C GAGG T C 
CGTGTCAC T C C CAT GC G TACCGAAAT CAT CATCAGAGC CACTC GTAC T CA 
AAAC GT T C T C G GT GAGAAGGGT AGGAGAAT CATAGAGTT GACAT CACTT G 
T CCAAAAGAGATT CAAAT T TC C T CAGGACAGTGT T GAGCT TTACGCTGAA 
AAA 

SEQ. ID. No. 26 

>clone#82-prot (40S ribosomal protein)

MATQ IS KKRKFVADGVF YAELNE VL TRELAE DGY S GVEVRVT PMRT E 1 1 1 RATRTQNVL G 
EKGRR1IELTSLVQKRFKFPQDSVELYAEK 
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SEQ. ID. No. 27 

>clone#79 (unknown protein)

GGCACGAGGCCTCGTGCCGAATTCGGCACGAGGGA^GTTAGTAAGAAATC 
AAACCCTTGCAGGCGACTTGGAGATAAGAAGCAGATTGTTTACCAAATGT 
TTTCTGGAACAAGACTTGTGTAAATAGGTGGAATCTTGGTTGGTTTTTGA 
GGTATTCATCAAATCTAACACACTCAAAAGATGGGATGTGTTTCTTCTTG
CTTCCGTGTCCAAGACATTGATGAGTACATGAATCCAAGTAGCTCTGTAT
ATAGGAACTGTCCCTGCATTAGATGCCTTGCTCATAATTTCCTTAACCTG
TATATCACGGTATTCAGGAGAGGGGAAACCCGATCTCTCCCGTCATCAGT
TCAAGCGACTGCATCGATAAGTTCGTCTTCTTTCCACGATAACTTTCTGT
CTGAAGCATTCCGTTCAACTCCGAGACCTCTGCCTTACGATGCTGATCCA 
AGATACATCCGCTCACTCGTCTCAAGAAGAGAGAAAGGTTCTAGCCATTC
CCATGAGGAAGCTGAGCCTCTAAGAAGCGATGGCGACGCTGCGGATTCCG 
AATCTTTCAGAGGATGCAGCAAATGGGGAAACAACAAATCCGACAAAGAT
GCCAAAGAAGACTACTCTAGTAAATCTAGTCTCAGGATTTCGAAATCAAA
GTCAATGGTTGACACTGAAAGCATTTATGTATTGTCTGAAGATGAAGATG
TGTGTCCTACTTGTCTTGAAGAATATACATCAGAGAATCCAAAGATTGTA
ACGAAATGCTGTCACCATTTCCATCTTGGTTGCATTTATGAATGGATGGA
GAGAAGTGAAAACTGTCCTGTCTGCGGAAAGGTGATGGAGTTTAACGAAA 
CACCTTGATCATCGATCATTGATCTGTGTCTTGTATCTCAACTGAAACCG 
GGGAAGATGAAGATGACAAGGCATTGCAAAGGAGATGTTTTTGTAAATTT
GGCTTTGTTGGTTTGTGAATATTGTCAATGACAATGGTAAATATATGAAG
CAGAAAGGGAGAAAATATGTTCCTCTGCTTTTCAACAGTTTTACGACATT
GGATAT CT TAAATATTTAAT TACGAATAAT AATATATCAACA&GAGACAA 
GAAAAATACGTTTGTTTAGGTAA 
SEQ. ID. No. 28
>clone#79-prot (unknown protein)

MGCVSSCFRVQDIDEYMNPSSSVYRNCPCIRCLAHNFLNLYITVFRRGETRSLPSSVQAT
ASITSSSFHDHFLSEAFRSTPRPLPYDADPRYIRSLVSRREKGSSHSHEEAEPLRSDGDA
ADSESFRGCSKWGNNKSDKDAKEDYSSKSSLRISKSKSMVDTSSIYVLSEDEDVCPTCLE
EYTSENPKIVTKCCHHFHLGCIYSWMERSENCPVCGKTMEFNETP



SEQ. ID. No. 29
>clone#62 (unknown protein)
GGCACGAGGCTCAAATCAGATCGGTTTCCATGGCTGCAGCTGCTAACACC
GCCGCCATTTTCGCCTCTCCTTCGCAGCCTTTATCCTCTAAAAGCAGTTT 
TTTGTACAGCTCAGCGATTGGTCAAATACAAAGGAGATTTCCAAGGAGGA
AACTTGATCTGCAAGTASAAGCTGTTGCCACGACTCTTACACCCCTTGAA
GAGACCAAAGAATATAAGCTACCTTCATGGGCAATGTTCGAGATGGGGAC
AGCTCCTGTGTACTGGAAAACCATGAACGGTCTTCCTCCAACCGCAGGAG
AAAAGTTGAAGCTATTCTATAATCCAGCTGCAACCAAACTCACTCTTAAC
GAAGACTATGGAGTTGCTTTCAACGGAGGATTAACCAACCAATCATGTGT
GGTGGGGAACCAAGAGCAATGCTTAAGAAAGATCGAGGCAAAGCCGATTC
TCCCATTTACACTATGCAGATTTGCATTCCTAAGCACGCTGTGAATTTGA
TATTCTCGTTTACCAATGGCGTGGACTGGGACGGTCCATACAGACTTCAG
TTTCAAGTCCCAAGCGATGGCAAACAAACCTATCGAGTTCTTCAATGAAG
GTCTAGCGAAAGAGTTGAGCCAAGACGGAGCCTGCGAGAGAGCAATATTT
CCTGACTCGAACGTAGTTGCGACGCGGTGCACAATGATCGCCAACTTGAC
GGTGGAAGGAGGAGATAGATGCAATCTGGATTTGGTTCCTGGGTGCATGG 
ATACAAATTCGGAACATTTCAACCCGTTTGCTAATGTTGATGATGGCTCC 
TGTCCCCTCGACTTATCTGATTCTGATGAATAGAGCTATAGCATTTTCTT
ATGTAAATATATGAACCCATATGTTAATATCAGTACGTAGTATTTGAATT
TAAATATGTATACATGTGGTAACTTGTTGGGTTTTACTATTATATAAGAA 
GCTTCACAATCAAA
SEQ. ID. No. 30

>clone#62-prot (unknown protein)

MAAAANTAAIFASPSQPLSSKSSFLYSSAIGQIQRRFPRRKLDLQVKAVATTLTPLEETK

EYKLPSWAMFEMGTAPVYWKTMNGLPPTAGEia 

QSCVVGNQEQCLRKIEAKPILPFTLCRFAFLSTL
SEQ. ID. No. 31

>clone#215 (unknown protein) 

GGCACGAGGCATCCAAGTCCCGGAGAAATCGATCGTAGCTCGGTGCCTTT
CGCTTTATAAAATCGC^TCTCGACAGGGGAAGMAAjGTTCGTTGCTTTCC
CTCAAAAATCTCCAATTTCTCGTTTCATTTCCGTTAATTTATCGTTTCAC
CGAACGCACGTCGAAACCCTATAACCCAATTGGTTTTTTGCGGGTCAACT
TCAGCTTCGAGTTATCTAGGGTTTCGTATCTGAATGTGTGGAGAANAAAA
CCCTTCTTGTGGGGGTTACCTAAATTTTCTGAAATCAGAGCTTTAAAAGG
GACAGCTTTTATTTGTATGGAAGGTCTCTGTCACTAAACTACATATTGAT
ATGGAGGCACAAATTCATCAACTTGAGCAGGAAGCGTATACTGCTGTTTT
AAGGGCTTTCAAAGCGCAGTCAGATGCTATTTCTTGGGACAAGGAAAGCT
TGATAACAGAGCTGCGTAAAGAATTGAGAGTATCTGATGACGAACATCGG
GAGCTGCTGAGTANGGTCAAATAAGGACGATACTATCCCAAGGATTTACG
GATTGGAGACCANGGAGGCGGAAGTCNAAGTTCCNAGACATGCAGCTATT
CAGCCTTTNTGAATGTNGGNTC

SEQ. ID. No. 32

>clone#215-prot (unknown protein) 

MEAQIHQLEQEAYTAVIRAFKAQSDAISWDKESLITELRKELRVSDDEHR
ELLSXVK 



SEQ. ID. No. 33 

>clone#114 (metallothionein-like protein) 
GGCACGAGGGTGAAATTTCAGCTCAAATCTACGACTGAAAAACTCATTTT
CATTGTTTTGTAAGCTACTTGTTTAAAGCACTTATCAGAATGGACTCATG
TTGCAAAAAAGTTTCTTCCGACTCGAGCTGCAGCGCCAAGCCGACTACAA
ATTGCATTTGTGTCCAGAATTCCAACAAATGCCCCTGCTGTGATAACAAA
TCAGAGTGTTGCTGCAAGCAGGCGAATTCCTGCTGCACGAGTACAAATAA
TTCAAGCGGCTGTTCTAACCAGGCTAAAACGTGTTGCTCTAAGTAGATGT 
TTGTCAACTATGATTTCAACATTTTGGACTGATTACTTTCGATCTTCGTT
TGTACGAGTACAAAGTAATATATGTCATTCTTTAATTCATAAGAATTTTA 
CTGGATCATTCAACATGCATATAAATTTTATATGTGCTTCGGCTATGTAA
AGTGAACGCAGATGGTGTACAATAAGTTCATGACTGCTTTCTTTACTAGA 
GAGGAAAAATGATGATGTTTTCAGCATAGCTGCTAGACCTACATAATATT
GTAATAAAATAAACCACAAAATGTTAAATATATTGTACCTTTTACCAAAA
AAAAAAAAAAAAAAAAAAAAJ^AAA 
SEQ. ID. No. 34

>clone#114-prot (metallothionein-like protein)

MDSCCKKVSSDSSCSAKPTTNCICVQNSMKCPCCDNKSECCCKQANSCCT 

STNNSSGCSNQAKTCCSK 






